Methods, systems, and products for hashing using twisted tabulation

ABSTRACT

Methods, systems, and products describe a robust solution for the dictionary problem of data structures. A hash function based on tabulation is twisted to utilize an additional xoring operation and a shift. This twisted tabulation offers strong robustness guarantees over a set of queries in both linear probing and chaining.

COPYRIGHT NOTIFICATION

A portion of the disclosure of this patent document and its attachments contain material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyrights whatsoever.

BACKGROUND

This disclosure generally relates to communications and to cryptography and, more particularly, to network routing, to congestion reduction of data, and to algorithmic function encoding.

Monitoring of data networks is desired. Network operators monitor the performance of clients to identify any problems, including security issues, reliability concerns, and performance bottlenecks. An Internet router, for example, classifies packets of data with hash tables. If the hash tables cannot keep pace with Internet traffic, data will be lost. The router must therefore be monitored to ensure its worst-case performance meets minimum targets.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

The features, aspects, and advantages of the exemplary embodiments are better understood when the following Detailed Description is read with reference to the accompanying drawings, wherein:

FIG. 1 is a simplified schematic illustrating an environment in which exemplary embodiments may be implemented;

FIG. 2 is a more detailed schematic illustrating the operating environment, according to exemplary embodiments; and

FIG. 3 is a generic block diagram of a processor-controlled device, according to exemplary embodiments.

DETAILED DESCRIPTION

The exemplary embodiments will now be described more fully hereinafter with reference to the accompanying drawings. The exemplary embodiments may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. These embodiments are provided so that this disclosure will be thorough and complete and will fully convey the exemplary embodiments to those of ordinary skill in the art. Moreover, all statements herein reciting embodiments, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future (i.e., any elements developed that perform the same function, regardless of structure).

Thus, for example, it will be appreciated by those of ordinary skill in the art that the diagrams, schematics, illustrations, and the like represent conceptual views or processes illustrating the exemplary embodiments. The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing associated software. Those of ordinary skill in the art further understand that the exemplary hardware, software, processes, methods, and/or operating systems described herein are for illustrative purposes and, thus, are not intended to be limited to any particular named manufacturer.

As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless expressly stated otherwise. It will be further understood that the terms “includes,” “comprises,” “including,” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Furthermore, “connected” or “coupled” as used herein may include wirelessly connected or coupled. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.

It will also be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first device could be termed a second device, and, similarly, a second device could be termed a first device without departing from the teachings of the disclosure.

FIG. 1 is a simplified schematic illustrating an environment in which exemplary embodiments may be implemented. FIG. 1 illustrates a client-server network architecture that monitors data traffic. A server 20 communicates with a client device 22 via a communications network 24. The server 20 sends a stream 26 of data to the client device 22. The stream 26 of data may include any content, such as a movie, music, call, or any other data. Regardless, the stream 26 of data is routed by a router 28. The router 28 receives and forwards the stream 26 of data to an address associated with the client device 22. FIG. 1 illustrates the router 28 executing a monitoring application 30. The monitoring application 30 is a software algorithm that monitors the performance of the server 20 and/or the communications network 24. The monitoring application 30 extracts many different data records 32 and stores the data records 32 in a streaming database 34. The monitoring application 30 may query the streaming database 34 for the data records 32, and the monitoring application 30 analyzes the data records 32. The monitoring application 30 may then generate reports and/or alarms indicative of the performance of the server 20 and/or the communications network 24. While FIG. 1 illustrates the router 28 executing the monitoring application 30, the router 28 is only an exemplary hardware component that monitors performance. The monitoring application 30 may be executed by any network component.

FIG. 2 is a more detailed schematic illustrating the operating environment, according to exemplary embodiments. The router 28 has a processor 50 (e.g., “μP”), application specific integrated circuit (ASIC), or other component that executes the monitoring application 30 stored in a memory 52. The monitoring application 30 may cause the processor 50 to produce a graphical user interface (“GUI”) 54. The graphical user interface 54 is illustrated as being visually produced on a display device 56, yet the graphical user interface 54 may also have audible features. The monitoring application 30, however, may operate in any processor-controlled device, as later paragraphs will explain.

The monitoring application 30 classifies the data records 32. The monitoring application 30, for example, may use linear probing 60 to classify the data records 32. The monitoring application 30 thus accesses one or more hash tables 62, which are known features of the linear probing 60. The hash tables 62 may be generated by a hash function 64, which is stored in the memory 52 and executed by the processor 50. The hash function 64 may be a separate module or portion of the monitoring application 30. The hash table 62, for example, may be stored in a buffer or cache portion 66 of the local memory 52, but the hash table 62 may be remotely accessed and maintained at any location in the communications network (illustrated as reference numeral 24 in FIG. 1). An example of the monitoring application 30 is the GIGASCAPE® application for monitoring network traffic.

Regardless, the one or more hash tables 62 are often bottlenecks in processing data streams, such as the stream 26 of data and other Internet traffic. Applications using the hash tables 62 are often time critical, in the sense that if the hash tables 62 cannot keep up with the traffic, then data will be lost. Keeping up with traffic requires both good average throughput over the length of the stream 26 of data (e.g., good amortized performance) and fast handling of individual data items (e.g., worst-case performance). Interest in worst-case performance is motivated not only by the real-time requirements of such systems, but also by the worry of adversarial attacks: if an adversary can accurately time the performance of the system on various data items, it could conceivably mount a denial-of-service attack by exploiting key values that the hash tables 62 are unusually slow in handling.

Conventional hashing techniques may thus utilize back-up hash tables. To handle the occasional slow queries in a main hash table, conventional hashing techniques may establish a back-up hash table. Whenever the number of probes made by an insertion passes a certain constant threshold, the data item is deferred to a secondary hash table. The goal or hope is to keep the secondary back-up hash table sparse enough such that no operation will require high running time. Unfortunately, the use of a second layer of hashing to improve robustness complicates the implementation and slows down the queries (which must always look up keys in two hash tables).

Exemplary embodiments reduce, or even eliminate, secondary hashing. The perceived non-robustness of linear probing is not inherent, but rather an artifact of using weak hash functions. Exemplary embodiments thus show that a strong hash function 64, backed up by an appropriate mathematical analysis, improves the state of the art among practical implementations of robust hash tables.

Exemplary embodiments may utilize short-range amortization. Robustness may utilize the following overlooked property of simple hash tables:

Observation 1: In chaining and linear probing with constant load, any window of L=Θ(lg n) operations on distinct keys takes time Θ(L) with high probability (w.h.p.).

The proof of this observation is rather elementary and will be implicit in the technical analysis below. Conceptually, Observation 1 shows that chaining (illustrated as reference numeral 70 in FIG. 2) and the linear probing 60 can be used, with nearly-ideal robustness guarantees, in any streaming application that can afford a buffer of size Ω(lg n). Real-time systems (such as the network router 28) are normally implemented with cyclic buffers, so the short-window amortization of Observation 1 is an implicit property of the system that comes at no cost.

The conceptual message of Observation 1 is the following: instead of designing a (complicated) hash table 62 that guarantees O(1) worst-case performance, exemplary embodiments may use simple and practical hash tables by augmenting the system design with a very short buffer. Note that this conceptual observation does not close the problem of practical robust hash tables, but merely rephrases the question from the design of hash tables to the design of hash functions. The simplicity and efficiency of the linear probing 60 or chaining 70 can only be enjoyed with a similarly efficient and elegant hash function 64 that makes Observation 1 “come true.” Further discussion of this hash function design problem is reserved for later paragraphs.

A potential concern about the robustness guarantee of Observation 1 is that it may require that the operations in the window be on distinct keys. This is clearly needed for any concentration-type property: for instance, L copies of the same unsuccessful query will run exactly L times slower than a single query. If we are willing to augment the hash table 62 to support repeated keys in a short window, many simple solutions suggest themselves. The simplest is to store the last L=Θ(lg n) keys in a secondary table of size n^(ε), which guarantees constant-time operations w.h.p.

To maintain a practical, pragmatic stance, however, these solutions would burden the implementation with no real benefits. Indeed, an important feature of modern hardware is caching, and the running time of an algorithm is normally dominated by memory accesses outside the cache portion 66 of the local memory 52. In both linear probing 60 and chaining 70, two operations on the same key will access the exact same memory locations, except for possibly a few locations affected by insertions and deletions intervening between the two operations. This means that if the processor 50 offers a cache 66 of nontrivial size, i.e. larger than L=Θ(lg n) by an appropriate constant factor (a very realistic assumption), then any repeated key in a short window will actually incur a negligible running time, since all necessary data is already available in cache. The inventors believe this completely resolves the issue of repeated keys in a small window, and thus this matter is ignored.

The performance of individual operations in the linear probing 60 has been the subject of intense theoretical investigations in published literature. Unfortunately, from a robustness perspective, the conclusion is negative: we expect frequent bad performance. Here the inventors switch to the study of a window of log n operations, showing that the combined performance is robust. Studying log n operations offers obvious challenges for implementable hashing schemes. For example, under the classic notion of O(1)-independence, the keys in the window are not hashed independently, yet the inventors have to show that bad performance with one key does not correlate too strongly with bad performance for the other keys.

The analysis so far highlights an interesting general target (“short-range amortization”) in the analysis of streaming algorithms in the context of real-time performance guarantees: if one can show that the running time of the algorithm has a very robust behavior over a short window of stream items, then by simply augmenting the system design with a large enough buffer, one can avoid the design of more complicated algorithms that might guarantee a worst-case time bound per stream item. As a prime example of this analysis target, the inventors concentrate on one of the most fundamental data structure problems: the dictionary problem. The goal is to use classic, realistic hash tables 62 such as chaining and linear probing to achieve robustness guarantees comparable to more complicated data structures that were specially designed for robustness.

Note that short-range amortization may not settle the question of robustness for chaining and linear probing, but merely rephrases it into an interesting question about hash function design. Indeed, the promise of simplicity and practical efficiency of using linear probing/chaining with robustness guarantees is only realized if an equally simple and practical hash function 64 can be used to implement these schemes and maintain the robustness guarantee.

The inventors may thus start with the following intriguing question about replacing the assumption of truly random hashing by an explicit hash function:

Question 1: Can one design a simple and practically efficient hash function preserving the guarantees of Observation 1?

The standard theoretical paradigm used to analyze explicit hash functions is the notion of k-independence. In our case, it is standard to show that Observation 1 continues to hold with O(L)-independent hash functions, i.e. with Θ(lg n)-independence. Unfortunately, known hash functions that guarantee Θ(lg n)-independence fail to address our basic question, as they are neither simple nor realistic in a practical implementation. Note that all solutions besides polynomial hashing use tabulation (illustrated as reference numeral 80), i.e. they use memory that is at least a root of the universe size. The use of tabulation techniques for strong hash functions is almost universal in the literature, and is, to some extent, justified by a lower bound. This lower bound states that any ω(1)-independent family of hash functions with O(1) evaluation time requires space u^(Ω(1)).

Fortunately, tabulation-based techniques are not incompatible with the goal of designing practically efficient hash functions. By choosing an appropriate parameter c, tables of size u^(1/c) can be made to fit in fast cache, making evaluation very efficient. Among the surprising success stories in the literature, mention is made of the 5-independent tabulation family, which offers an order-of-magnitude speed-up compared to the fastest known implementation of 5-independent polynomial hashing.

An important canonical example of tabulation-based hashing is simple tabulation. In this hash function 64, a key x∈[u] is interpreted as a vector of c characters from Σ=[u^(1/c)], i.e. x=(x₁, . . . , x_(c))∈Σ^(c)=[u]. The hash function 64 is initialized by c tables T₁, . . . , T_(c) of |Σ| random values (in the desired output range), and the hash function 64 is evaluated by looking up each character in its own table and xoring the results:

$h(x) = \bigoplus_{i=1}^{c} T_{i}[x_{i}]$
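As a minimal sketch of simple tabulation, assume keys are non-negative integers split into c characters of `char_bits` bits each, with the low-order character taken as x₁ (an arbitrary choice made here). The names `make_tables`, `split_key`, and `simple_tabulation`, and the use of Python's `random` module to fill the tables, are illustrative assumptions and not part of the disclosure.

```python
import random

def make_tables(c, char_bits, out_bits, seed=0):
    """Build c tables T_1..T_c, each with 2**char_bits random out_bits-bit values."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    size = 1 << char_bits
    return [[rng.getrandbits(out_bits) for _ in range(size)] for _ in range(c)]

def split_key(x, c, char_bits):
    """Interpret x as (x_1, ..., x_c); x_1 is the low-order character (assumption)."""
    mask = (1 << char_bits) - 1
    return [(x >> (i * char_bits)) & mask for i in range(c)]

def simple_tabulation(x, tables, char_bits):
    """h(x) = T_1[x_1] xor ... xor T_c[x_c]."""
    h = 0
    for table, ch in zip(tables, split_key(x, len(tables), char_bits)):
        h ^= table[ch]
    return h

# Example: 32-bit keys, c = 4 characters of 8 bits, 16-bit hash codes.
tables = make_tables(c=4, char_bits=8, out_bits=16)
code = simple_tabulation(0xDEADBEEF, tables, char_bits=8)
```

With c=4 and 8-bit characters, each table has 256 entries, which is the kind of size that can be expected to fit in fast cache.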

Simple tabulation is only 3-independent. However, the inventors have demonstrated that by stepping outside the k-independence paradigm and analyzing simple tabulation directly in applications, it is possible to prove much stronger properties than this independence would suggest. For example, if simple tabulation is used with linear probing in a table of fill 1−ε, the expected running time of an operation is O(1/ε²), the same guarantee achieved by truly random hashing.

This ideal performance for an individual query may raise the hope that simple tabulation also answers the above design challenge, by guaranteeing good performance on a window of queries. Unfortunately, it does not, as the following simple counterexample shows:

Observation 2. Consider a linear probing table with fill ½ implemented with simple tabulation hashing. There exists an adversarial set of keys that can be inserted into the table, and a set of L=lg n queries, such that the running time of the queries exceeds

$\Omega\left(L \cdot \frac{\lg n}{\lg\lg n}\right)$

with probability at least 1/n^(ε).

Proof: The construction is simply L “parallel” key sets. First insert into the hash table 62 the key set [L]×[n/L]. At the end, execute the queries [L]×{0}. By a simple counting argument, there is a probability of 1/n^(ε) that

$\Omega\left(\frac{\lg n}{\lg\lg n}\right)$

consecutive table positions following h((i, 0)) will be filled by keys from {i}×[n/L]. If this happens for a certain i∈[L], it will happen for all values of i, since the relative positions of the keys in {i}×[n/L] are the same up to a common shift by T₁[i]. This means that the queries are maximally correlated, and they can all become slower by an almost logarithmic factor simultaneously, with a nontrivial probability. The situation can be extended over a long stream of queries, e.g. querying (i, j) for all j∈[n/L] and all i∈[L]. We expect one in every n^(ε) windows of queries to deviate significantly from the expectation, by a nearly logarithmic factor. This is no better than the robustness of an individual query, which also deviates from the mean by a logarithmic factor with n^(−ε) probability.
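The correlation exploited in this counterexample can be checked directly on a toy instance. The sketch below assumes c=2, so a key (i, j) has head i and tail j, and it uses small hypothetical sizes (`L`, `TAILS`, `M_BITS`) chosen only for illustration. It shows that the hash codes of the keys {i}×[n/L], taken relative (under xor) to the query (i, 0), do not depend on i.

```python
import random

L, TAILS, M_BITS = 4, 8, 10   # hypothetical sizes: L heads, n/L tails, 2**10 slots
rng = random.Random(1)
T1 = [rng.getrandbits(M_BITS) for _ in range(L)]       # table for the head character
T2 = [rng.getrandbits(M_BITS) for _ in range(TAILS)]   # table for the tail character

def h(i, j):
    """Simple tabulation with c = 2 on the key (i, j)."""
    return T1[i] ^ T2[j]

def offsets(i):
    """Hash codes of {i} x [n/L], relative (under xor) to the query (i, 0)."""
    return [h(i, j) ^ h(i, 0) for j in range(TAILS)]

patterns = [offsets(i) for i in range(L)]
# T1[i] cancels out, so every head i yields the same pattern T2[j] xor T2[0].
same_pattern = all(p == patterns[0] for p in patterns)
```

Because the pattern is identical for every head, a bad layout around one query is reproduced around all L queries at once, which is the behavior the twist is designed to break.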

Exemplary embodiments twist the tabulation 80. The below paragraphs show that extending the guarantees of simple tabulation for a single query to robust guarantees on a window of queries only requires a very simple twist to the hash function 64, which preserves the simplicity and practical efficiency of simple tabulation.

In a key x (illustrated as reference numeral 82 in FIG. 2), the first character x₁ will play a special role, and will be called the head, head(x)=x₁ (illustrated as reference numeral 84). The rest of the key 82 is called the tail, tail(x) (illustrated as reference numeral 86). Conceptually, the hash function 64 is initialized by 2c−1 random tables:

-   c−1 twist tables T*₂, T*₃, . . . , T*_(c), each of size |Σ| containing random values from Σ.
-   c tables as in simple tabulation, T₁, . . . , T_(c), each of size |Σ| containing random values from the desired output range [m].

The hash function 64 begins by twisting (illustrated as reference numeral 88 in FIG. 2) the head 84 of the key 82 according to the twisted hash code of the tail 86, and then evaluating simple tabulation on the twisted key:

$x^{*} = \left(x_{1} \oplus \left(\bigoplus_{i=2}^{c} T_{i}^{*}[x_{i}]\right), x_{2}, \ldots, x_{c}\right)$

$h(x) = \bigoplus_{i=1}^{c} T_{i}[x_{i}^{*}]$
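The two formulas above translate almost line by line into code. The following is a minimal sketch of the conceptual (2c−1 table) form, assuming a key is already given as a tuple of c characters with the head first. The function names and the tiny parameters are illustrative assumptions. The sketch also checks exhaustively, on a small universe, that twisting maps distinct keys to distinct twisted keys.

```python
import random
from itertools import product

def make_twisted_tables(c, char_bits, out_bits, seed=0):
    """Return (twist tables T*_2..T*_c, simple tables T_1..T_c)."""
    rng = random.Random(seed)
    size = 1 << char_bits
    twist = [[rng.getrandbits(char_bits) for _ in range(size)] for _ in range(c - 1)]
    simple = [[rng.getrandbits(out_bits) for _ in range(size)] for _ in range(c)]
    return twist, simple

def twist_key(chars, twist):
    """x* = (x_1 xor T*_2[x_2] xor ... xor T*_c[x_c], x_2, ..., x_c)."""
    head = chars[0]
    for table, ch in zip(twist, chars[1:]):
        head ^= table[ch]
    return (head,) + tuple(chars[1:])

def twisted_tabulation(chars, twist, simple):
    """h(x) = T_1[x*_1] xor ... xor T_c[x*_c]."""
    h = 0
    for table, ch in zip(simple, twist_key(chars, twist)):
        h ^= table[ch]
    return h

# Tiny universe: c = 3 characters of 3 bits each, 8-bit hash codes.
twist, simple = make_twisted_tables(c=3, char_bits=3, out_bits=8)
keys = list(product(range(8), repeat=3))
twisted = {twist_key(k, twist) for k in keys}   # set of all twisted keys
```

Only the head changes under the twist, and keys that share a tail receive the same xor mask, so the map from keys to twisted keys is a bijection on the universe.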

Though we have described the hash function in terms of 2c−1 tables and 2c−1 memory lookups, it can be seen that only c tables and lookups are needed. Indeed, we can combine T*_(i) and T_(i) into a common table T′_(i) with entries of lg Σ+lg m bits. The hash function begins by xoring the entries corresponding to the tail, ⊕_(i=2)^(c) T′_(i)[x_(i)]. The low-order lg Σ bits of the result are xored with head(x), after which we make the final lookup into T₁ for the hash code of the twisted head, which is then xored with the remaining lg m bits of the result.

Thus, the implementation of twisted tabulation is essentially parallel to simple tabulation, requiring just one additional xor and one shift. It is known that, with a sensible choice of c, the practical efficiency of this scheme leaves nothing to be desired. On current architectures, the evaluation time turns out to be competitive with just one 64-bit multiplication, which can be considered the ultimate target for any hash function (since even the simplest universal hashing requires multiplication).
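The combined-table evaluation described above can be sketched as follows. The layout of each entry (hash bits in the high part, twist bits in the low-order lg Σ bits), the parameter values, and the function names are assumptions made for illustration. The sketch compares the combined form against the separate-table form on random keys.

```python
import random

CHAR_BITS, OUT_BITS, C = 8, 16, 4     # hypothetical parameters
SIZE, MASK = 1 << CHAR_BITS, (1 << CHAR_BITS) - 1
rng = random.Random(2)
T1 = [rng.getrandbits(OUT_BITS) for _ in range(SIZE)]
# Separate tables T*_i and T_i for the tail characters i = 2..c.
twist = [[rng.getrandbits(CHAR_BITS) for _ in range(SIZE)] for _ in range(C - 1)]
simple = [[rng.getrandbits(OUT_BITS) for _ in range(SIZE)] for _ in range(C - 1)]
# Combined tables T'_i: hash bits in the high part, twist bits in the low part.
combined = [[(s << CHAR_BITS) | t for s, t in zip(srow, trow)]
            for srow, trow in zip(simple, twist)]

def hash_combined(chars):
    """Twisted tabulation with c lookups: c - 1 into T'_i and one into T_1."""
    acc = 0
    for table, ch in zip(combined, chars[1:]):
        acc ^= table[ch]                      # lookups for the tail
    head = chars[0] ^ (acc & MASK)            # the one additional xor
    return (acc >> CHAR_BITS) ^ T1[head]      # the one shift, then the final lookup

def hash_separate(chars):
    """Reference form using the twist tables and simple tables separately."""
    head, h = chars[0], 0
    for trow, srow, ch in zip(twist, simple, chars[1:]):
        head ^= trow[ch]
        h ^= srow[ch]
    return h ^ T1[head]

sample = [tuple(rng.randrange(SIZE) for _ in range(C)) for _ in range(1000)]
agree = all(hash_combined(k) == hash_separate(k) for k in sample)
```

The two forms agree because xor acts independently on the low and high bit fields of each combined entry.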

Despite the simplicity of this new hash function 64, an appropriate analysis reveals that it offers strong robustness guarantees over a set of queries in both the linear probing 60 and chaining 70. We will prove the following theorems:

Theorem 2. In chaining implemented with twisted tabulation, any window of

$L = \Theta\left((\log n) / \left\lceil \frac{n}{m} \right\rceil\right)$

operations on distinct keys has total cost

$O\left(L \left\lceil \frac{n}{m} \right\rceil\right)$

with high probability in n.

Theorem 3. Consider linear probing implemented with twisted tabulation, and let the fill be

$\frac{n}{m} = 1 - \varepsilon$

where ε≥1/n^(o(1)). Any window of L≥lg n operations on distinct keys will have total cost O(L/ε²) with high probability in n.

The reader may observe that short-range amortization with twisted tabulation loses nothing compared to the performance of a single query, and we recover the optimal dependence on the fill achieved by truly random hashing. Thus, twisted tabulation offers a simple and efficient solution that losslessly transforms single-operation performance under truly random hashing into robust amortized performance over short windows.

An analysis is now presented. The following known theorem captures some of the fundamental properties of simple tabulation that we shall reuse in many places.

Theorem 4 (Simple Tabulation). Consider hashing n balls into m≥n^(1−1/(2c)) bins by simple tabulation. Let q be an additional query ball, and define X_(q) as the number of regular balls that hash into a bin chosen as a function of h(q). Let

$\mu = E[X_{q}] = \frac{n}{m}.$

The following probability bounds hold for any constant γ:

$(\forall)\,\delta \le 1:\quad \Pr\left[|X_{q} - \mu| > \delta\mu\right] < 2e^{-\Omega(\delta^{2}\mu)} + m^{-\gamma} \qquad (1)$

$(\forall)\,\delta = \Omega(1):\quad \Pr\left[X_{q} > (1+\delta)\mu\right] < (1+\delta)^{-\Omega((1+\delta)\mu)} + m^{-\gamma} \qquad (2)$

For any m≤n^(1−1/(2c)), every bin gets

$\frac{n}{m} \pm O\left(\sqrt{n/m}\,\log^{c} n\right) \qquad (3)$

keys with probability 1−n^(−γ).

A caveat of Theorem 4 is that it only concerns the performance relative to a single query key, stating that from its perspective, things are essentially as good as with a perfectly random hash function 64. As revealed by the counterexample above, we know this is inherent for simple tabulation. We can construct a set of log n parallel universes, each with a single key 82, such that the possibly bad performance of that key 82 is repeated in all universes.

We begin by observing that twisted keys remain distinct: if x≠y, then x*≠y*. Indeed, if tail(x)≠tail(y), the keys 82 are distinct because the tails 86 are not twisted; otherwise the twist added to the head 84 is the same, so the heads 84 (which must originally have been distinct) remain distinct. The main property that we shall use is that we are, w.h.p., in the following situation:

Property 5. We have a set S of twisted keys and a disjoint set Q of twisted query keys. Let n=|S|+|Q|. Let φ∈(0,1] be a constant parameter to be determined, and assume that n≥Σ^(φ) and |Q|≤Σ^(φ/3). Now for every character a∈Σ:

-   (i) There are O(1+n/Σ^(φ)) keys from S with head a.
-   (ii) There are at most O(1) query keys from Q with head a.

Proof: We show that twisted tabulation satisfies the property with φ=⅔ w.h.p. in Σ.

First we argue that when we twist distinct keys x and y, we get distinct twisted keys x* and y*. It is only the heads 84 that get twisted, so if x and y differ in the tails 86, then so do x* and y*. Hence we may assume that they only differ in the heads 84 while they have a common tail z. In this case, the twisted heads 84 are head(x)⊕hs₀(z) and head(y)⊕hs₀(z), where hs₀(z) is the twist hash of the common tail z, i.e. ⊕_(i=2)^(c) T*_(i)[z_(i)], so with common tails 86 and differing heads 84, we get differing twisted heads.

Lemma 6. Let T be a set of at most Σ^(2/3) keys. When we twist these keys, then, w.h.p. in Σ, each twisted head is shared by O(1) keys from T.

Proof: A similar argument was used in the literature, but exemplary embodiments take the twisting 88 into account. Consider the set A of keys that end up with a given twisted head a. Above we just proved that with common twisted heads 84, the tails 86 must be different, so the keys in A have distinct tails 86. The twisting of the heads 84 in A is based on simple tabulation of these distinct tails. From the literature it is known that we can find a subset B⊂A of size max{|A|^(1/(c−1)), lg|A|}, so that simple tabulation maps the tails from B independently. For each x∈B, we have

head(x*)=hs₀(tail(x))⊕head(x)=a.

Hence, for the given set B, the combined probability of the common twisted head a is 1/Σ^(|B|). With the set size b=|B| fixed, the probability of any such set is

$\binom{\Sigma^{2/3}}{b} \cdot \Sigma / \Sigma^{b} \le b^{b} / \Sigma^{b/3-1}.$

With a large enough b=Θ(1), we conclude, w.h.p. in Σ, that this does not happen for any subset B⊂T of size b, but then this also limits the size of A.

Applying Lemma 6 to the small query set Q, we immediately get property (ii). To prove (i) we partition S∪Q arbitrarily into O(1+n/Σ^(2/3)) sets S_(i) of size Θ(Σ^(2/3)). Each set S_(i) contributes a constant to each twisted head, so in total, each twisted head is common to O(1+n/Σ^(2/3)) keys from S. This completes the proof that twisted keys, w.h.p. in Σ, satisfy Property 5 with φ=⅔.

The chaining 70 is now discussed. As a warm-up illustrating some of the basic ideas, we handle chaining assuming that Property 5 holds for our twisted keys.

Theorem 7: Assuming Property 5, if we amortize over windows with more than (log n)/(1+n/m) operations on distinct keys, then w.h.p. in n, the cost per operation is O(1+n/m).

The bound of Theorem 7 does not benefit from n≤m, so adding dummy keys, we can assume n≥m. In the proof of Theorem 7, we first describe all the relevant consequences of Theorem 4. The interesting new part, which does not hold for simple tabulation, is captured by Lemma 8 below, which will also be used in our study of linear probing.

First, by Theorem 4, if n=Ω(m log n), then w.h.p. in n, all bins/chains have O(n/m) keys, and then every operation takes O(n/m) time. We can therefore assume that n/m=o(log n). We are studying the variable X_(Q) denoting the number of keys from S ending up in the same bins as the query keys. The counting is with multiplicity if we have multiple queries in the same bin. We want to show that X_(Q)=O(|Q|) w.h.p.
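To make the quantity X_(Q) concrete, the sketch below computes it for a chaining table in two equivalent ways: once per query, and once per bin with the number of queries in that bin as the multiplicity. As a stand-in for the hash function it assigns bins uniformly at random, which is an assumption made to keep the sketch short; it is not twisted tabulation. The names `chain_loads` and `window_cost` are likewise hypothetical.

```python
import random
from collections import Counter

def chain_loads(bin_of, keys):
    """Number of stored keys per bin (chain length) under the assignment bin_of."""
    return Counter(bin_of[k] for k in keys)

def window_cost(bin_of, stored, queries):
    """X_Q: stored keys sharing a bin with each query, counted once per query."""
    loads = chain_loads(bin_of, stored)
    return sum(loads[bin_of[q]] for q in queries)

rng = random.Random(3)
m = 64
stored = list(range(200))          # the key set S
queries = list(range(200, 208))    # a window Q of distinct query keys
# Stand-in for the hash function: truly random bins (assumption).
bin_of = {k: rng.randrange(m) for k in stored + queries}

x_q = window_cost(bin_of, stored, queries)
# The same quantity grouped by bin: chain length times queries landing there.
loads = chain_loads(bin_of, stored)
queries_per_bin = Counter(bin_of[q] for q in queries)
x_q_by_bin = sum(loads[b] * cnt for b, cnt in queries_per_bin.items())
```

The per-bin form makes the role of multiplicity explicit: a chain visited by two queries in the window is paid for twice.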

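Exemplary embodiments may divide the keys into groups depending on the head character. A query group is one that contains a query key. We let R denote the family of query groups; all remaining groups are non-query groups. The contribution of the query groups to X_(Q) is denoted X_(R,Q), and the contribution of the non-query groups is the sum of X_(G,Q) over all non-query groups G, where X_(G,Q) is the contribution of a single group G.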

Exemplary embodiments may first fix the hashing of all the tails. Applying Theorem 4 to each group G, w.h.p., we get that each bin gets only a constant number of keys from G. By the union bound, this holds for all groups. Note that the total number of keys in all the query groups from R is bounded by

Σ^(φ/2)·n/Σ^(φ)=n/Σ^(Ω(1))

keys, so applying Theorem 4, we conclude, w.h.p., that the query groups distribute with only a constant number of keys in each bin. The contribution X_(R,Q) of R to X_(Q) is then a constant per query, as desired. We also note that we have only a constant number of queries per query group, so our multiplicities are constant. The more interesting thing is how the non-query groups distribute in the query bins.

Lemma 8. Suppose we have m≥n/Σ^(φ/3) bins. After we have fixed the hashing of all tails and all query heads, w.h.p., no matter how we fix the head of any non-query group G, the group's contribution X_(G,Q) to the query bins is O(1). If multiple queries are in the same bin, the keys from G in this bin are counted with multiplicity.

Proof: There are fewer than n groups, so by the union bound, it suffices to prove the high probability for an arbitrary group G. Likewise, for each group there are only m hash values, so it suffices to prove high probability for any given one, i.e., the situation where the hash of G is completely fixed with O(1) keys in each bin. Thus, we will show that the contribution X_(G,Q) from G is constant w.h.p.

Independent random hashing of the query heads is performed. A query group A has O(1) queries, each ending in a bin with O(1) keys from G. Hence the contribution from A to X_(G,Q) is O(1). Thus X_(G,Q) is the sum of independent contributions bounded by some constant d. Moreover

E[X_(G,Q)]=|G||Q|/m≤(n/Σ^(φ))·Σ^(φ/3)/(n/Σ^(φ/3))≤1/Σ^(φ/3).

It follows w.h.p. in Σ that X_(G,Q)=O(d).

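By Lemma 8, after we have hashed the query groups and all tails, each X_(G,Q) is an independent O(1) contribution to the total contribution of the non-query groups, $\sum_{G} X_{G,Q}$, where the sum ranges over all non-query groups G. Moreover, $E\left[\sum_{G} X_{G,Q}\right] = \sum_{G} |G||Q|/m \le n|Q|/m = O(\log n)$. It follows by Chernoff bounds, w.h.p., that $\sum_{G} X_{G,Q} = O(\log n) = O(|Q|\,n/m)$. This completes the proof of Theorem 7.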

In preparation for linear probing, a summary so far is presented with a slight strengthening. When exemplary embodiments hash into m bins, the bins are indexed 0, 1, . . . , m−1. We consider bins i and i+1 to be neighbors, wrapping around so that bins m−1 and 0 are also neighbors. With these bins, the triple bin of a query is the query bin and the two neighboring bins. It can easily be checked that considering the triple bins of the queries rather than just the query bins can only change the constants.

Proposition 9. Suppose we have m≥n/Σ^(φ/3) bins. After we have fixed the hashing of all tails and all query heads, w.h.p., the query groups contribute O(1) to the triple bin of each query. Moreover, no matter how we hash the head of any non-query group G, the group's combined contribution X_(G,Q) to the triple bins of all the queries is O(1).

The linear probing 60 is now discussed. We show here that the linearprobing 60 is very robust if we use it with twisted tabulation hashing,or any other scheme using simple tabulation on twisted keys satisfyingProperty 5 with high probability.

Theorem 10, Assume Property 5 holds for keys in a linear probing tablewith fill of a=n/m=(1−ε) where ε≧1/n^(o(1)). If we amortize over windowswith more than log n operations on distinct keys, then w.h.p. in n, thecost per operation is O(1/ε²).

The bounds of Theorem 10 are new and tight even for perfectly random hash functions. The bounds of Theorem 10 do not benefit from ε>½, so, adding dummy keys if necessary, we can assume ε≦½, that is, 2n≧m. Others have proved that the expected cost per operation in linear probing is Θ(1/ε²). Our Theorem 10 states that the expected cost is achieved within a constant factor w.h.p. as soon as we amortize over more than log n operations.

In linear probing all elements are kept in a single array with entries [0, m). Adding keys one by one, we place a key q in the first empty position starting from h(q). The positions that get filled this way do not depend on the order in which keys are inserted. To bound the cost of operations with a key q, including deletes, we consider the situation where q is already inserted. The immediate cost is the length R_(q) of the run from h(q) to the first empty position. For upper bounds it is more convenient, however, to study the length X_(q) of the filled run I_(q) around h(q) between the empty slots on either side. Trivially X_(q)≧R_(q). The nice combinatorial property of I_(q) is that exactly X_(q)=|I_(q)| keys hash directly into I_(q).
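The quantities R_(q) and X_(q) can be made concrete with a small sketch. It assumes a cyclic table with at least one empty slot and a caller-supplied hash function `h`; the names `build_table` and `run_lengths` are hypothetical and chosen for illustration only.

```python
def build_table(keys, h, m):
    """Insert keys one by one; each key goes to the first empty slot
    at or after h(key), wrapping around cyclically."""
    if len(keys) >= m:
        raise ValueError("the sketch assumes at least one empty slot")
    table = [None] * m
    for key in keys:
        pos = h(key) % m
        while table[pos] is not None:
            pos = (pos + 1) % m
        table[pos] = key
    return table


def run_lengths(table, start):
    """Return (R, X, first) for a filled position `start`:
    R is the distance from start to the first empty slot,
    X is the length of the maximal filled run containing start,
    first is the position where that run begins."""
    m = len(table)
    if all(slot is not None for slot in table):
        raise ValueError("the sketch assumes at least one empty slot")
    if table[start] is None:
        return 0, 0, start
    r = 0
    while table[(start + r) % m] is not None:
        r += 1
    left = 0
    while table[(start - left - 1) % m] is not None:
        left += 1
    return r, left + r, (start - left) % m
```

With m=16, keys 0 through 9, and the toy hash h(k)=k² mod 16, the run starting from position 4 has R=4 while the surrounding filled run has X=8, and exactly 8 keys hash directly into that run. Inserting the same keys in reverse order fills the same set of positions.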

The basic result from the literature on linear probing with ε≦½ was that for any given key q, w.h.p.,

Pr[X_(q)≧x]≦2e^(−Ω(ε²x)).  (4)

This implies X_(q)=O((log n)/ε²) w.h.p. Here we show that for a set Q of query keys, w.h.p.,

$\begin{matrix}{\Pr\left[\sum_{q \in Q} X_{q} \geq x\right] \leq 2^{O(|Q|)}\, e^{-\Omega(\varepsilon^{2} x)}.} & (5)\end{matrix}$

From (5) with |Q|≧log n and a large enough x=Θ(|Q|/ε²), we get Σ_(q∈Q)X_(q)=O(|Q|/ε²) w.h.p., which is the statement of Theorem 10.
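The step from (5) to Theorem 10 can be spelled out as follows. The constants C and c standing for the hidden constants in 2^(O(|Q|)) and Ω(ε²x), the constant K in the choice of x, and the target exponent γ are names introduced here for illustration only. Taking x=K|Q|/ε² and |Q|≧ln n (the base of the logarithm only affects constants),

$\Pr\left[\sum_{q \in Q} X_{q} \geq \frac{K|Q|}{\varepsilon^{2}}\right] \leq 2^{C|Q|}\, e^{-cK|Q|} = e^{-(cK - C\ln 2)|Q|} \leq n^{-(cK - C\ln 2)} \leq n^{-\gamma} \quad \text{whenever } K \geq \frac{\gamma + C\ln 2}{c}.$

So a large enough constant K yields any desired polynomial failure probability.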

We now relate to Proposition 9. We know from (4) that, w.h.p., the filled interval I_(q) around each query hash h(q) is of size X_(q)=O((log n)/ε²). Since Σ^(φ)=((log n)/ε)^(ω(1)), the remark below implies that all the constant contribution bounds from Proposition 9 apply to the I_(q).

Remark 11. With m′=Θ(n/Σ^(φ/3)), the triple bin of q∈Q from Proposition 9 contains the interval [h(q)−Θ(Σ^(φ/3)),h(q)+Θ(Σ^(φ/3))].

With the constant contribution bounds of Proposition 9, we get to use Chernoff bounds as we would for fully random hash functions. Indeed, the remaining proof of Theorem 10 is essentially the same as it would be in the fully-random case. However, even in the fully-random case, Theorem 10 was not known, and for ε=o(1), the proof becomes rather delicate. There are essentially two sources behind the factor 2^(O(|Q|)). One is simply that we are willing to pay O(1) per key. The more essential source has to do with the number of ways that the total contribution

X_(Q)=Σ_(q∈Q)X_(q)

to the query intervals can be distributed on the individual contributions X_(q). We get an exponential bound if we only need to know the individual contributions within a constant factor. Formally, we use the following simple combinatorial lemma:

Lemma 12. There is a universal family W_(l) of 2^(O(l)) weight vectors {right arrow over (w)}=(w₁, . . . , w_(l)) where Σw_(i)>¼ and for all i, either w_(i)≧1/(2l) or w_(i)=0, and such that for every vector {right arrow over (z)}=(z₁, . . . , z_(l)) with Σz_(i)=1, there is a (w₁, . . . , w_(l))∈W_(l) with w_(i)≦z_(i)<2w_(i) for all i with w_(i)>0.

Proof. First we show a trivial map from a given (z₁, . . . , z_(l)) to (w₁, . . . , w_(l)). Later we bound the size of the range W_(l). For each i, if z_(i)<1/(2l), we set w_(i)=0; otherwise, we round z_(i) down to the nearest negative power of two to get w_(i). The cases z_(i) with w_(i)=0 can add up to at most ½ and in the remaining cases, we lose less than a factor 2, so Σ_(i)w_(i)>((Σ_(i)z_(i))−½)/2=¼.

Let p be the smallest integer such that 2^(p)>2l. Then for each i there is a j∈{0, . . . , p−1} such that w_(i)=2^(j−p). To describe (w₁, . . . , w_(l)), we first have a base bit vector with l bits, telling when w_(i)>0. In addition, for j=0, . . . , p, we have a step-up bit vector that for each w_(i)>2^(j−p) tells if w_(i)>2^(j+1−p). All these bits describe a unique {right arrow over (w)}. Since Σ_(i)w_(i)≦1, the number of bits in step-up vector j is less than 1/2^(j−p)<4l/2^(j), so the total number of bits is l+Σ_(j=0) ^(p)(4l/2^(j))<9l. Thus |W_(l)|<2^(9l).
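The rounding map from the first part of the proof can be sketched as follows. The sketch uses exact rational arithmetic so that the comparisons are not affected by floating point error; the function name `round_weights` is hypothetical.

```python
from fractions import Fraction


def round_weights(z):
    """Map a vector z of nonnegative Fractions summing to 1 to a weight vector w:
    entries below 1/(2l) become 0, all others are rounded down to the
    nearest power of two 2^(-k) with k >= 0."""
    l = len(z)
    if sum(z) != 1:
        raise ValueError("the entries of z must sum to 1")
    threshold = Fraction(1, 2 * l)
    w = []
    for zi in z:
        if zi < threshold:
            w.append(Fraction(0))
            continue
        power = Fraction(1)
        while power > zi:          # find the largest 2^(-k) not exceeding zi
            power /= 2
        w.append(power)
    return w
```

For z=(1/2, 1/3, 1/8, 1/24) with l=4, the threshold is 1/8 and the map yields w=(1/2, 1/4, 1/8, 0), whose sum 7/8 exceeds ¼, and every nonzero w_(i) satisfies w_(i)≦z_(i)<2w_(i).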

We will use W_(Q) to denote W_(|Q|) but using the queries q∈Q as indices. Now, suppose we want to bound the probability that X_(Q)=Σ_(q)X_(q)=Θ(x) for some value x=Ω((log n)/ε²). This probability will decrease exponentially in x, so we will automatically get a bound for X_(Q)=Ω(x). It is, however, crucial that we also have an upper bound on X_(Q) before we focus on the individual contributions X_(q). Using Lemma 12, we know that W_(Q) contains a vector (w_(q))_(q∈Q) such that w_(q)>0 implies X_(q)=Θ(w_(q)x). Using the union bound, we consider one (w₁, . . . , w_(l)) at a time. We know |W_(|Q|)|=2^(O(|Q|)), so to prove (5), it suffices to prove, w.h.p.,

Pr[∀w_(q)>0:X_(q)=Θ(w_(q)x)]=2^(O(|Q|))e^(−Ω(ε²x)).  (6)

From (4) we got X_(q)=O((log n)/ε²) w.h.p., so we can assume that x_(q)=w_(q)x=O((log n)/ε²) for all q∈Q. We are looking for an interval I_(q) of length X_(q)=Θ(x_(q)) with at least |I_(q)| keys hashing to it. If I_(q) was fixed, the expected number of keys hashing to I_(q) would be only (1−ε)|I_(q)|.

As stated in Remark 11, we inherit all the constant contribution bounds from Proposition 9. In particular, it follows, w.h.p., that the query groups combined only contribute a constant number of keys to I_(q)⊂[h(q)−Θ(Σ^(φ/3)),h(q)+Θ(Σ^(φ/3))].

Let a=O(1) bound the maximal contribution from the query groups to any query interval. Then X_(q)≦X_(𝒢,I_q)+a where X_(𝒢,I_q) is the contribution of the non-query groups G∈𝒢 to the query interval I_(q). We now restrict our attention to intervals I_(q) of length at least 2a/ε. Then to fill I_(q) with X_(q)≧|I_(q)|, we need X_(𝒢,I_q) to be bigger than

(1+ε/2)α|I_(q)|=(1+ε/2)(1−ε)|I_(q)|<|I_(q)|−a.

Here α=1−ε is the fill, and the last inequality uses (1+ε/2)(1−ε)<1−ε/2 together with (ε/2)|I_(q)|≧a.

The lower bound 2a/ε on |I_(q)| does not affect our proof of (6), for it suffices to consider x≧2b|Q|/ε² for a large enough constant b. By definition of W_(Q), we have w_(q)≧1/(2|Q|) if w_(q)>0, so we get x_(q)=xw_(q)≧b/ε². For large enough b, this implies 2a/ε<|I_(q)|=Θ(x_(q)). With {right arrow over (w)} fixed, we only care about queries q with w_(q)>0. In our formulas below, we will simply assume that Q has been restricted to such queries. Technically, this could leave some query groups with zero queries, but that does not matter. Thus we can assume

∀q∈Q:x_(q)>b/ε² for any fixed constant b.  (7)

Having discounted the effect of the query groups, from now on, we restrict our attention to the fill from the non-query groups in 𝒢. For δ=ε/2, we are looking for an interval I_(q) of size Θ(x_(q)) so that the contribution X_(𝒢,I_q) from the non-query groups in 𝒢 to I_(q) is at least (1+δ)α|I_(q)|. If I_(q) was fixed, the expected value of X_(𝒢,I_q) would be bounded by α|I_(q)|. Our problem is, of course, that I_(q) could be anywhere as long as it has length Θ(x_(q)) and contains h(q).

ε=Ω(1). Before considering the case of small ε, we consider the easier case where the fill is bounded away from 1, that is, ε=Ω(1). Then from [PT11], we know combinatorially that having (1+δ)α|I_(q)| keys in I_(q)∋h(q) implies that one of O(1/ε) dyadic intervals J_(q) of length Θ(ε|I_(q)|)=Θ(εx_(q)) around h(q) has (1+δ/2)α|J_(q)| keys. Here a dyadic interval is one of the form [i2^(p),(i+1)2^(p)) for some integer power p. We index the potential dyadic intervals −k, . . . , 0, . . . , k, with k=O(1/ε), with 0 representing the one containing h(q), and −1 and +1 for its neighbors and so forth. It is important that the indexing is the same no matter the size of I_(q). Over all queries q, there are only (2k+1)^(|Q|) choices of indices. Assume that we have guessed the right bad index for every query q, pointing out a specific dyadic interval J_(q) relative to h(q). Our indexed bad event is that the non-query groups contribute (1+δ/2)α|J_(q)| keys to J_(q) for every q.
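The indexing of dyadic intervals relative to h(q) can be illustrated with a short sketch. It ignores wrap-around at the ends of the table, which is an assumption made here for brevity, and the names `dyadic_candidates` and `count_index_vectors` are hypothetical.

```python
def dyadic_candidates(hq, p, k):
    """Return the dyadic intervals [i*2^p, (i+1)*2^p) indexed -k..k,
    where index 0 is the interval containing the hash position hq.
    Each interval is a half-open pair (lo, hi). Wrap-around is ignored."""
    size = 1 << p
    center = hq >> p               # i such that i*2^p <= hq < (i+1)*2^p
    return {j: ((center + j) * size, (center + j + 1) * size)
            for j in range(-k, k + 1)}


def count_index_vectors(k, num_queries):
    """Number of ways to pick one index in -k..k for every query."""
    return (2 * k + 1) ** num_queries
```

For h(q)=37, p=3 and k=2, index 0 is the interval [32, 40), index −1 is [24, 32), and index 2 is [48, 56). With three queries there are 5³=125 index vectors to sum over in the union bound.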

We shall use the following notation. For each non-query group G and query q, we let Y_(G,q) denote the contribution of G to J_(q). Summing over the non-query groups, we define Y_(q)=Σ_(G∈𝒢)Y_(G,q). With μ_(q)=α|J_(q)|, we get E[Y_(q)]≦α|J_(q)| and in the bad event, Y_(q)≧(1+δ/2)α|J_(q)|.

From Proposition 9 and Remark 11, we get that after the hashing of the tails and the query heads (including the h(q)), the contribution Y_(G,Q)=Σ_(q∈Q)Y_(G,q) of a non-query group G to all the query intervals J_(q)⊂[h(q)−Θ(Σ^(φ/3)),h(q)+Θ(Σ^(φ/3))] is O(1). The Y_(G,Q), G∈𝒢, are independent random variables to be fixed when we hash the non-query group heads.

Let Y=Σ_(G∈𝒢,q∈Q)Y_(G,q)=Σ_(G∈𝒢)Y_(G,Q)=Σ_(q∈Q)Y_(q) be the combined contribution of all non-query groups to all the query intervals J_(q). Then Y is the sum of independent O(1) variables Y_(G,Q), so Chernoff bounds may be applied. With μ=Σ_(q∈Q)μ_(q), we get E[Y]≦μ while the bad event implies Y≧(1+δ/2)μ=(1+ε/4)μ. Finally,

$\mu = \sum_{q \in Q} \alpha |J_{q}| = \sum_{q \in Q} \Theta(\varepsilon x_{q}) = \Theta(\varepsilon x) = \Theta(x).$

It now follows from Chernoff bounds that the probability of the indexed bad event is exp(−Ω(ε²x)).
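As a numeric sanity check, the standard multiplicative Chernoff bound for a sum of independent variables in [0, 1] with mean at most μ states that the probability of reaching (1+δ)μ is at most (e^δ/(1+δ)^(1+δ))^μ, which for 0<δ≦1 is at most exp(−δ²μ/3). The sketch below evaluates both expressions; it illustrates this textbook bound after normalizing the O(1) variables to [0, 1] and is not taken from the disclosure. The function names are hypothetical.

```python
import math


def log_chernoff_factor(delta):
    """Natural log of e^delta / (1+delta)^(1+delta)."""
    return delta - (1.0 + delta) * math.log1p(delta)


def chernoff_bound(mu, delta):
    """Upper bound on Pr[Y >= (1+delta)*mu] for a sum Y of independent
    [0, 1] variables with E[Y] <= mu."""
    return math.exp(mu * log_chernoff_factor(delta))
```

The bound decays exponentially in δ²μ, which with δ=Θ(ε) and μ=Θ(x) matches the form exp(−Ω(ε²x)) used above.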

The bad event from (6) implies that one of the (2k+1)^(|Q|) index vectors is bad, so by the union bound, we get an overall probability bound of

(2k+1)^(|Q|)exp(−Ω(ε²x))=2^(O(|Q|))exp(−Ω(ε²x)).

The derivation exploits that k=O(1/ε)=O(1). This completes the proof of (6), hence of (5) and Theorem 10, for the case ε=Ω(1).

ε=o(1). We will now consider the much more intricate case where ε=o(1). Recall our bad event. With δ=ε/2, for each query q, we are looking for an interval I_(q) of size Θ(x_(q)) so that the contribution X_(𝒢,I_q) from the non-query groups in 𝒢 to I_(q) is at least (1+δ)α|I_(q)|.

The single query version of this case was also studied in [PT11]. Instead of considering Θ(1/ε)=ω(1) dyadic intervals, we look at dyadic intervals on different levels. From [PT11], we get the combinatorial statement that if an interval I_(q) of length Θ(x_(q)) has (1+δ)α|I_(q)| (non-query) keys, then for some “level” i≧0, there is one of the 2^(i) dyadic intervals J_(q) of length Θ(x_(q)/2^(i)) around h(q) which has

$\left(1 + \delta 2^{\frac{4}{5}i}\right)\alpha |J_{q}|$

(non-query) keys. The point here is that the relative deviation

$\delta_{i} = \delta 2^{\frac{4}{5}i}$

grows with the number 2^(i) of intervals considered. As stated in [PT11], we only need to consider levels i where δ_(i)≦1, so for ε=Ω(1), there would only be a constant number of levels and dyadic intervals.

For each query q, we guess a level i_(q), and an index j_(q)=O(2^(i_q)) of a dyadic interval J_(q). We need to sum the probabilities, over all such combinations of levels and indices, of the indexed bad event that for every q we end up with (1+δ_(i_q))α|J_(q)| non-query keys in J_(q). Below we first focus on one such indexed bad event and define the same random variables as we did with ε=Ω(1). We have Y_(G,q) denoting the contribution of non-query group G to J_(q), Y_(q)=Σ_(G∈𝒢)Y_(G,q), Y_(G,Q)=Σ_(q∈Q)Y_(G,q) and Y=Σ_(q∈Q)Y_(q)=Σ_(G∈𝒢)Y_(G,Q). From Proposition 9 and Remark 11, we again get that every Y_(G,Q) is bounded by some constant d. For each q, we also define μ_(q)=α|J_(q)|, so E[Y_(q)]≦μ_(q) while the bad event implies Y_(q)≧(1+δ_(i_q))μ_(q).

We will derive some special Chernoff bounds tailored to handle all the queries despite the different δ_(i_q). The calculations are designed to look like those in the standard proofs of Chernoff bounds. The interesting part is the arguments between the calculations, which explain how the standard calculations can be applied.

To standardize the calculations, we first normalize, dividing all the above contributions by d. We use ′ to denote this normalization. Then Y′_(G,Q)≦1 w.h.p. after the hashing of the tails and query heads is fixed. In this situation, the contributions from different non-query groups are independent variables to be fixed when we hash their group heads. More precisely, we have that the vectors (Y′_(G,q))_(q∈Q) for different G are independent of each other. For contrast we note that for a given G, and different queries q₁ and q₂, the variables Y′_(G,q₁) and Y′_(G,q₂) may be highly correlated.

We are trying to bound the bad event that for all q∈Q simultaneously, Y′_(q)>(1+δ_(i_q))μ′_(q). We study the quantity Π_(q∈Q)(1+δ_(i_q))^(Y′_q). Our bad event implies that this quantity exceeds Π_(q∈Q)(1+δ_(i_q))^((1+δ_(i_q))μ′_q), so by Markov's inequality, the probability of the bad event is bounded by

$\frac{E\left[\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{Y_{q}^{\prime}}\right]}{\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{(1 + \delta_{i_{q}})\mu_{q}^{\prime}}}$

Using the independence of the vectors (Y′_(G,q))_(q∈Q) for different G, we get

$E\left[\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{Y_{q}^{\prime}}\right] = E\left[\prod_{G \in \mathcal{G}}\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{Y_{G,q}^{\prime}}\right] = \prod_{G \in \mathcal{G}} E\left[\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{Y_{G,q}^{\prime}}\right].$

For a given G, we now study

$\Phi_{G} = E\left[\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{Y_{G,q}^{\prime}}\right].$

We know that E[Y′_(G,q)]≦μ′_(G,q) and that Σ_(q∈Q)Y′_(G,q)≦1. Subject to these constraints, we claim that the distribution with the largest mean Φ_(G) consists of value vectors that are either a single 1 with 0s elsewhere, or all 0s. To prove this, we will transform any distribution into this form, without decreasing the mean Φ_(G). Thus consider some distribution with an event {Y′_(G,q)}_(q∈Q)={y′_(G,q)}_(q∈Q) happening with probability p>0. Suppose that it is not of the above form. Let s′_(G)=Σ_(q∈Q)y′_(G,q). First consider the case where 0<s′_(G)<1. Since every δ_(i_q)>0, we have Π_(q∈Q)(1+δ_(i_q))^(y′_(G,q))>1. By convexity this implies that we get a higher Φ_(G) if we locally, with probability s′_(G), scale all y′_(G,q) up by a factor 1/s′_(G), and use the all 0s event otherwise. Thus, we can assume Σ_(q∈Q)y′_(G,q)=1. Next, by the weighted arithmetic-geometric mean inequality, using the y′_(G,q) as weights, we have

$\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{y_{G,q}^{\prime}} \leq \sum_{q \in Q} y_{G,q}^{\prime}\left(1 + \delta_{i_{q}}\right).$

However, the right hand side is exactly the contribution to Φ_(G) if we locally replace the event {Y′_(G,q)}_(q∈Q)={y′_(G,q)}_(q∈Q) with a distribution which has Y′_(G,q) as the only 1 and 0s elsewhere with probability y′_(G,q). Both transformations preserve the means of the Y′_(G,q) and can only increase Φ_(G), and at the end, we get a distribution of the desired form. In the overall worst-case distribution, we have Y′_(G,q) as the only 1 and 0s elsewhere with probability E[Y′_(G,q)]≦μ′_(G,q). Therefore

$\begin{matrix}{\Phi_{G} = E\left[\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{Y_{G,q}^{\prime}}\right]} \\{\leq \left(\sum_{q \in Q}\mu_{G,q}^{\prime}\left(1 + \delta_{i_{q}}\right)\right) + \left(1 - \sum_{q \in Q}\mu_{G,q}^{\prime}\right)} \\{= 1 + \sum_{q \in Q}\mu_{G,q}^{\prime}\delta_{i_{q}}} \\{\leq \exp\left(\sum_{q \in Q}\mu_{G,q}^{\prime}\delta_{i_{q}}\right)}\end{matrix}$

Reordering terms, we now get the following probability bound for our bad event.

$\begin{matrix}{\frac{E\left[\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{Y_{q}^{\prime}}\right]}{\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{(1 + \delta_{i_{q}})\mu_{q}^{\prime}}} = \frac{\prod_{G \in \mathcal{G}} E\left[\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{Y_{G,q}^{\prime}}\right]}{\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{(1 + \delta_{i_{q}})\mu_{q}^{\prime}}}} \\{\leq \frac{\prod_{G \in \mathcal{G}}\exp\left(\sum_{q \in Q}\mu_{G,q}^{\prime}\delta_{i_{q}}\right)}{\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{(1 + \delta_{i_{q}})\mu_{q}^{\prime}}}} \\{= \frac{\prod_{q \in Q}\exp\left(\sum_{G \in \mathcal{G}}\mu_{G,q}^{\prime}\delta_{i_{q}}\right)}{\prod_{q \in Q}\left(1 + \delta_{i_{q}}\right)^{(1 + \delta_{i_{q}})\mu_{q}^{\prime}}}} \\{= \prod_{q \in Q}\frac{\exp\left(\mu_{q}^{\prime}\delta_{i_{q}}\right)}{\left(1 + \delta_{i_{q}}\right)^{(1 + \delta_{i_{q}})\mu_{q}^{\prime}}}} \\{= \prod_{q \in Q}\left(\frac{\exp\left(\delta_{i_{q}}\right)}{\left(1 + \delta_{i_{q}}\right)^{1 + \delta_{i_{q}}}}\right)^{\mu_{q}/d}} \\{= \prod_{q \in Q}\exp\left(-\Omega\left(\delta_{i_{q}}^{2}\mu_{q}\right)\right).}\end{matrix}$

Incidentally, this is the same probability bound we would have gotten if the contributions to each J_(q) were independent, which is certainly not the case. As a final step, recall that μ_(q)=α|J_(q)|=Θ(x_(q)/2^(i_q)). Moreover,

$\delta_{i_{q}}^{2}\, x_{q}/2^{i_{q}} = \varepsilon^{2} 2^{\frac{8}{5}i_{q}}\, x_{q}/2^{i_{q}} = \varepsilon^{2} 2^{\frac{3}{5}i_{q}} x_{q},$

so we can rewrite our probability bound to

$\prod_{q \in Q}\exp\left(-\Omega\left(\varepsilon^{2} 2^{\frac{3}{5}i_{q}} x_{q}\right)\right).$

We now have to sum the above probabilities over all combinations where we for each query q pick a level i_(q), and an index j_(q) picking one out of the 2^(i_q) dyadic intervals. Hence we can compute the combined probability as

$\begin{matrix}{\Pr\left[\forall q \in Q : X_{q} = \Theta\left(x_{q}\right)\right]} \\{= \sum_{(i_{q})_{q \in Q}}\left(\prod_{q \in Q}\left(2^{i_{q}}\exp\left(-\Omega\left(\varepsilon^{2} 2^{\frac{3}{5}i_{q}} x_{q}\right)\right)\right)\right)} \\{= \prod_{q \in Q}\left(\sum_{i}\left(2^{i}\exp\left(-\Omega\left(\varepsilon^{2} 2^{\frac{3}{5}i} x_{q}\right)\right)\right)\right).}\end{matrix}$

Next, concerning the sum for given q, recall from (7) that we may assume x_(q)>b/ε² for any constant b. Then ε²x_(q)>b and with b large enough, the terms

$2^{i}\exp\left(-\Omega\left(\varepsilon^{2} 2^{\frac{3}{5}i} x_{q}\right)\right)$

decrease rapidly with i. Therefore

$\sum_{i}\left(2^{i}\exp\left(-\Omega\left(\varepsilon^{2} 2^{\frac{3}{5}i} x_{q}\right)\right)\right) = O\left(\exp\left(-\Omega\left(\varepsilon^{2} x_{q}\right)\right)\right).$

Thus our probability bound simplifies to

$\begin{matrix}{\Pr\left[\forall q \in Q : X_{q} = \Theta\left(x_{q}\right)\right]} \\{= \prod_{q \in Q} O\left(\exp\left(-\Omega\left(\varepsilon^{2} x_{q}\right)\right)\right)} \\{= 2^{O(|Q|)}\exp\left(-\Omega\left(\varepsilon^{2}\sum_{q \in Q} x_{q}\right)\right)} \\{= 2^{O(|Q|)}\exp\left(-\Omega\left(\varepsilon^{2} x\right)\right).}\end{matrix}$

This completes the proof of (6), hence of (5) and Theorem 10.
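The claim that the sum over levels is dominated by its first term can be checked numerically. In the sketch below, the parameter B stands for the product of the hidden constant in the Ω with ε²x_(q); both this parameter and the function name `level_sum` are assumptions introduced for illustration.

```python
import math


def level_sum(B, max_level):
    """Return (total, first) where total is the sum over levels
    i = 0..max_level of 2^i * exp(-B * 2^(3i/5)) and first is the i = 0 term."""
    terms = [2 ** i * math.exp(-B * 2 ** (0.6 * i))
             for i in range(max_level + 1)]
    return sum(terms), terms[0]
```

With B=10, the whole sum stays within a few percent of the level 0 term exp(−B), in line with the O(exp(−Ω(ε²x_(q)))) bound. For small B the level 1 term can exceed the level 0 term, which is why the argument needs the assumption (7) with a large enough constant b.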

FIG. 3 is a schematic illustrating still more exemplary embodiments. FIG. 3 is a generic block diagram illustrating the monitoring application 30 operating within a processor-controlled device 100. As the above paragraphs explained, the monitoring application 30 may operate in any processor-controlled device 100. FIG. 3, then, illustrates the monitoring application 30 stored in a memory subsystem of the processor-controlled device 100. One or more processors communicate with the memory subsystem and execute the monitoring application 30. Because the processor-controlled device 100 illustrated in FIG. 3 is well-known to those of ordinary skill in the art, no detailed explanation is needed.

Exemplary embodiments may be physically embodied on or in a computer-readable storage medium. This computer-readable medium may include CD-ROM, DVD, tape, cassette, floppy disk, memory card, and large-capacity disks. This computer-readable medium, or media, could be distributed to end-subscribers, licensees, and assignees. A computer program product comprises processor-executable instructions for monitoring data, as explained above.

While the exemplary embodiments have been described with respect to various features, aspects, and embodiments, those skilled and unskilled in the art will recognize the exemplary embodiments are not so limited. Other variations, modifications, and alternative embodiments may be made without departing from the spirit and scope of the exemplary embodiments.

What is claimed is:
 1. A system, comprising: a processor; and memory storing code that when executed causes the processor to perform operations, the operations comprising: retrieving a hash function; generating a tabulation of the hash function; generating keys from the tabulation, with each of the keys having characters; denoting a first of the characters of each of the keys as a head; twisting the head of each of the keys to generate a twisted hash function; and hashing data using the twisted hash function.
 2. The system according to claim 1, wherein the operations further comprise denoting remaining ones of the characters as a tail of the keys.
 3. The system according to claim 2, wherein the operations further comprise xoring the tail of the keys.
 4. The system according to claim 1, wherein the operations further comprise interpreting each of the keys as a vector.
 5. The system according to claim 2, wherein the operations further comprise hashing the tail and fixing the head.
 6. The system according to claim 2, wherein the operations further comprise generating a twisted key by twisting the head of one of the keys according to the tail of the one of the keys.
 7. The system according to claim 6, wherein the operations further comprise tabulating the twisted key.
 8. A method, comprising: retrieving, from memory, a hash function; retrieving, from the memory, keys that correspond to data to be classified using the hash function; denoting, by a processor, a first character of each of the keys as a head; denoting, by the processor, remaining characters of each of the keys as a tail; twisting, by the processor, the head of one of the keys according to the tail of the one of the keys to generate a twisted hash function; and hashing the keys using the twisted hash function.
 9. The method according to claim 8, further comprising generating the keys from the data.
 10. The method according to claim 8, further comprising xoring the tail of the one of the keys.
 11. The method according to claim 8, further comprising interpreting each of the keys as a vector.
 12. The method according to claim 8, further comprising hashing the tail and fixing the head.
 13. The method according to claim 8, further comprising: xoring tails of all the keys to generate a result; and xoring heads of all the keys with the result.
 14. The method according to claim 8, further comprising: xoring tails of all the keys to generate a result; and xoring the head of the one of the keys with the result.
 15. A memory storing instructions that when executed cause a processor to perform a method, the method comprising: retrieving a hash function; retrieving keys that correspond to data to be classified using the hash function; denoting a first character of each of the keys as a head; denoting remaining characters of each of the keys as a tail; twisting the head of one of the keys according to the tail of the one of the keys to generate a twisted hash function; and hashing the keys using the twisted hash function.
 16. The memory according to claim 15, further comprising instructions for generating the keys from the data.
 17. The memory according to claim 15, further comprising instructions for xoring the tail of the one of the keys.
 18. The memory according to claim 15, further comprising instructions for interpreting each of the keys as a vector.
 19. The memory according to claim 18, further comprising instructions for: xoring tails of all the keys to generate a result; and xoring heads of all the keys with the result.
 20. The memory according to claim 15, further comprising instructions for: xoring tails of all the keys to generate a result; and xoring the head of the one of the keys with the result.